
[v22.x] src: clamp WriteUtf8 capacity to INT_MAX in EncodeInto #62621

Open
semimikoh wants to merge 2 commits into nodejs:v22.x-staging from semimikoh:fix-encodeinto-intmax

Conversation

@semimikoh
Contributor

Backport notice

This is a v22.x-only patch, not a cherry-pick. The bug it fixes was
incidentally resolved on main / v24.x by #58070, which migrated
EncodeInto from v8::String::WriteUtf8 to WriteUtf8V2. The
intent of #58070 was deprecation cleanup, not bug fixing.

WriteUtf8V2 is not available in v22.x's bundled V8 version, so
backporting #58070 in full is not feasible. This PR instead applies
a minimal scoped fix to EncodeInto only.

Problem

In TextEncoder.encodeInto, the destination buffer's byte length
is read as size_t but then implicitly narrowed to int when
passed as the capacity argument to v8::String::WriteUtf8. When
the destination view is larger than INT_MAX (2,147,483,647
bytes), the narrowing conversion underflows to a negative value.
V8 treats this as "no capacity available" and writes 0 bytes,
returning { read: 0, written: 0 } even though the buffer has
ample space.
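The root cause is the implicit `size_t` → `int` narrowing. As an illustration only (not part of the patch), JavaScript's `| 0` operator performs the same signed 32-bit wrap that the C++ narrowing produces on typical platforms:

```javascript
// Illustration only: `| 0` coerces to a signed 32-bit integer, mimicking
// the implementation-defined size_t -> int narrowing at the C++ call site.
const byteLength = 2429682061;   // the ~2.4 GB view length from the repro
const narrowed = byteLength | 0; // wraps past INT_MAX into negative territory
console.log(narrowed);           // -1865285235: V8 sees "no capacity"
```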

Reproduction on v22.x (from #62610):

```js
const _TE = new TextEncoder();
const s = "aÿ我𝑒";
const u8a = new Uint8Array(2429682061); // ~2.4 GB
const offset = 38928786;

_TE.encodeInto(s, u8a.subarray(offset));
// v22.x: { read: 0, written: 0 } ← bug
// v24.x: { read: 5, written: 10 } ← correct

_TE.encodeInto(s, u8a.subarray(offset, offset + 10));
// Both versions: { read: 5, written: 10 }
// (works because the bounded view is only 10 bytes, well within int range)
```

This is a silent data loss bug — no error is thrown, the caller
just gets a successful-looking return value with zero bytes written.

Fix

Clamp the capacity passed to WriteUtf8 to INT_MAX in
EncodeInto. The source string in encodeInto is bounded in
practice and never requires more than INT_MAX bytes to encode;
only the destination view length can exceed INT_MAX.
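The clamp semantics can be sketched as follows (in JavaScript for illustration; the actual patch is in C++ and the function name here is hypothetical):

```javascript
// Illustration only: clamp the capacity before it is narrowed to a signed
// 32-bit int, as the C++ fix does with INT_MAX.
const INT32_MAX = 2 ** 31 - 1;
function clampedCapacity(byteLength) {
  return Math.min(byteLength, INT32_MAX);
}
console.log(clampedCapacity(2429682061) | 0); // 2147483647, stays positive
console.log(clampedCapacity(10) | 0);         // 10, small views unaffected
```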

Testing

Built locally on v22.x-staging (v22.22.2-pre) and verified:

  • ✅ Original reproduction from #62610 ("TextEncoder.encodeInto NOT work on BIG utf8 subarray ON 22.4.1") now returns { read: 5, written: 10 } for both unbounded and bounded subarrays
  • ✅ New pummel regression test (test/pummel/test-whatwg-encoding-encodeinto-large.js) passes with the patched binary. Follows existing pummel conventions: skips on 32-bit architectures via common.skipIf32Bits(), skips on insufficient os.totalmem(), and skips on allocation failure
  • test/parallel/test-whatwg-encoding*, test-textencoder*, test-textdecoder* — 11 tests pass
  • test/parallel/test-buffer-*, test-string-decoder* — 69 tests pass
  • test/pummel/test-whatwg-encoding* — passes
  • cpplint.py on the modified file — clean
  • make lint-js — clean

Refs: #58070
Fixes: #62610

@nodejs-github-bot nodejs-github-bot added c++ Issues and PRs that require attention from people who are familiar with C++. needs-ci PRs that need a full CI run. v22.x Issues that can be reproduced on v22.x or PRs targeting the v22.x-staging branch. labels Apr 7, 2026
@Renegade334
Member

Please also ensure your commits are signed off.

@Renegade334 Renegade334 changed the title [v22.x backport] src: clamp WriteUtf8 capacity to INT_MAX in EncodeInto [v22.x] src: clamp WriteUtf8 capacity to INT_MAX in EncodeInto Apr 7, 2026
Member

@Renegade334 Renegade334 left a comment


Implementation LGTM, I think the tests could do with being simplified though.

In TextEncoder.encodeInto, the destination buffer's byte length is
read as a size_t but then implicitly narrowed to int when passed as
the capacity argument to v8::String::WriteUtf8. When the destination
view is larger than INT_MAX (2,147,483,647 bytes), the narrowing
conversion underflows to a negative value, V8 treats it as "no
capacity", and writes 0 bytes - returning { read: 0, written: 0 }
even though the buffer has plenty of room.

Clamp the capacity to INT_MAX before passing it to WriteUtf8. This
is sufficient because the source string in encodeInto is bounded in
practice and never requires more than INT_MAX bytes to encode; only
the destination view length can exceed INT_MAX.

This issue is already fixed on main and v24.x as a side effect of
PR nodejs#58070, which migrated to the non-deprecated WriteUtf8V2 method
whose capacity parameter is size_t. WriteUtf8V2 is not available in
v22.x's V8 version, so this minimal patch fixes only the EncodeInto
path instead of backporting the full migration.

Refs: nodejs#58070
Fixes: nodejs#62610
Signed-off-by: semimikoh <ejffjeosms@gmail.com>
Drop the bounded subarray case, remove the unnecessary error code
check, and use a 2**31 byte Uint8Array directly instead of slicing
an offset subarray.

Signed-off-by: semimikoh <ejffjeosms@gmail.com>
@semimikoh semimikoh force-pushed the fix-encodeinto-intmax branch from bb48d92 to b1de92b on April 7, 2026 22:50
Member

@Renegade334 Renegade334 left a comment


👍

@Renegade334 Renegade334 added author ready PRs that have at least one approval, no pending requests for changes, and a CI started. request-ci Add this label to start a Jenkins CI on a PR. labels Apr 13, 2026
@github-actions github-actions bot removed the request-ci Add this label to start a Jenkins CI on a PR. label Apr 13, 2026
